Speaker Extraction With Co-Speech Gestures Cue

Authors

Abstract

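Speaker extraction seeks to extract the clean speech of a target speaker from a multi-talker mixture speech. There have been studies that use a pre-recorded speech sample or a face image of the target speaker as the speaker cue. In human communication, co-speech gestures that are naturally timed with speech also contribute to speech perception. In this work, we explore the use of a co-speech gestures sequence, e.g. hand and body movements, as the speaker cue for speaker extraction, which could be easily obtained from low-resolution video recordings and is thus more available than face recordings. We propose two networks that use the co-speech gestures cue to perform attentive listening on the target speaker: one implicitly fuses the co-speech gestures cue in the speaker extraction process, while the other performs speech separation first, followed by explicitly using the co-speech gestures cue to associate a separated speech with the target speaker. The experimental results show that the co-speech gestures cue is informative in associating with the target speaker.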

Similar resources

Mental Timeline in Persian Speakers’ Co-speech Gestures based on Lakoff and Johnson’s Conceptual Metaphor Theory

One of the introduced conceptual metaphors is the metaphor of "time as space". Time as an abstract concept is conceptualized by a concrete concept like space. This conceptualization of time is also reflected in co-speech gestures. In this research, we try to find out what dimension and direction the mental timeline has in co-speech gestures and under the influence of which one of the metaphoric...

Single-speaker/multi-speaker co-channel speech classification

The demand for content-based management and real-time manipulation of audio data is constantly increasing. This paper presents a method to identify temporal regions, in a segment of co-channel speech, as being either single-speaker or multi-speaker speech. The state-of-the-art approach for this purpose is based on kurtosis. In this paper, a set of complementary time-domain and frequency-domain featur...

Co-speech gestures do not originate from speech production processes: Evidence from the relationship between co-thought and co-speech gestures

When we speak, we spontaneously produce gestures (co-speech gestures). Co-speech gestures and speech production are closely interlinked. However, the exact nature of the link is still under debate. To address the question of whether co-speech gestures originate from the speech production system or from a system independent of speech production, the present study examined the relationship...

Co-channel Speech and Speaker Identification Study

This study was comprised of two parts. The first was to determine the effectiveness of speaker identification under two different speaker identification degradation conditions, additive noise and speaker interference, using the LPC cepstral coefficient approach. The second part was to develop a method for determination of co-channel speech, i.e., speaker count, and to develop an effective metho...

Co-channel speaker identification using usable speech extraction based on multi-pitch tracking

Recently, usable speech criteria [1] have been proposed to extract minimally corrupted speech for speaker identification (SID) in co-channel speech. In this paper, we propose a new usable speech extraction method to improve SID performance in the co-channel situation, based on the pitch information obtained from a robust multi-pitch tracking algorithm [2]. The idea is to retain the speech segme...

Journal

Journal title: IEEE Signal Processing Letters

Year: 2022

ISSN: 1558-2361, 1070-9908

DOI: https://doi.org/10.1109/lsp.2022.3175130